Constructing a Norwegian Academic Wordlist
Abstract
We present the development of a Norwegian Academic Wordlist (AKA list) for the Norwegian Bokmål variety. To identify specifically academic vocabulary, we built a 100-million-word academic corpus based on the University of Oslo archive of digital publications. Other corpora were used for testing and for developing general word lists. We tried and compared two methods, those of Carlund et al. (2012) and Gardner & Davies (2013). The resulting list is presented on a website, where the words can be inspected in different ways and freely downloaded.
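As a rough illustration of how an academic corpus can be set against a general one, the sketch below flags words whose relative frequency is clearly higher in academic text. It is a minimal sketch under stated assumptions, not the procedure of either cited method: the function names, the toy tokens, and the `min_ratio` threshold are all hypothetical, and a real pipeline would also need lemmatization plus range and dispersion checks across disciplines.

```python
from collections import Counter


def relative_freqs(tokens):
    """Return each word's frequency per million tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count * 1_000_000 / total for word, count in counts.items()}


def academic_candidates(academic_tokens, general_tokens, min_ratio=1.5):
    """Return words that are relatively more frequent in the academic corpus.

    min_ratio is an illustrative threshold (an assumption, not a value
    taken from the abstract). Words missing from the general corpus are
    kept as candidates, since their ratio is unbounded.
    """
    acad = relative_freqs(academic_tokens)
    gen = relative_freqs(general_tokens)
    candidates = []
    for word, f_acad in acad.items():
        f_gen = gen.get(word)
        if f_gen is None or f_acad / f_gen >= min_ratio:
            candidates.append(word)
    return sorted(candidates)


# Toy example with made-up token lists
academic = ["analyse", "analyse", "og", "og", "metode", "og"]
general = ["og", "og", "og", "hus", "analyse", "hus"]
print(academic_candidates(academic, general))
```

In this toy input the function word "og" has the same relative frequency in both corpora and is filtered out, while "analyse" and "metode" remain as candidates.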
Similar resources
The Signed Languages of Eastern Europe
1. Purpose and scope 2. General survey methodology 3. Qualitative information 3.1 Eastern Europe 3.2 Bulgaria 3.3 Czech Republic 3.4 Estonia 3.5 Hungary 3.6 Latvia 3.7 Lithuania 3.8 Moldova 3.9 Poland 3.10 Romania 3.11 Russia 3.12 Slovakia 3.13 Ukraine 3.14 Republics and provinces of the former Yugoslavia: Bosnia and Herzegovina, Croatia, Kosovo, Macedonia, Montenegro, Serbia, Slovenia, Voivodi...
The Development of a Temporal Information Dictionary for Social Media Analytics
Dictionaries have been used to analyse text even before the emergence of social media and the use of dictionaries for sentiment analysis there. While dictionaries have been used to understand the tonality of text, so far it has not been possible to automatically detect if the tonality refers to the present, past, or future. In this research, we develop a dictionary containing time-indicating wo...
Modeling and Encoding Traditional Wordlists for Machine Applications
This paper describes work being done on the modeling and encoding of a legacy resource, the traditional descriptive wordlist, in ways that make its data accessible to NLP applications. We describe an abstract model for traditional wordlist entries and then provide an instantiation of the model in RDF/XML which makes clear the relationship between our wordlist database and interlingua approaches...
Credibility: Norwegian Students Evaluate Media Studies Web Sites
This paper investigates Norwegian university students’ evaluations of web site credibility and site authors’ vested interests with respect to a text-based academic site and an informational site with commercial support. Credibility ratings were higher for some aspects of the academic site even though the non-academic site was rated more highly in presentation design and currency. Negative correla...
Comparison of the South African Spondaic and CID W-1 wordlists for measuring speech recognition threshold
Background: The home language of most audiologists in South Africa is either English or Afrikaans, whereas most South Africans speak an African language as their home language. An English wordlist, the South African Spondaic (SAS) wordlist, which is familiar to the English Second Language (ESL) population, was developed by the author for testing the speech recognition threshold (SRT) ...